
Arquivo.pt: the Portuguese web-archive

Arquivo.pt, the Portuguese web-archive (PWA), is the national web archive of Portugal. Its mission is to periodically archive content of national interest available on the web, storing and preserving information of historical relevance for future generations. It is a service of the Foundation for Science and Technology (FCT).

10,841 results

Media Type: web (10,841)
Year: 2017 (1,288); 2016 (2,572); 2015 (2,518); 2014 (886); 2013 (618); 2012 (273)
Topics & Subjects: Portuguese Web Archive (10,828); Portuguese online publications (10,828); Complete crawl of the Portuguese web (5,947); Incremental crawl of the Portuguese web (4,881); 2016 (2,552); 2015 (2,538)
Creator: portuguese web archive (10,841)
Language: Portuguese (10,841)
Incremental crawl of the Portuguese web performed between 23 September 2014 and 24 October 2014 mainly from .PT domain. The AWP16 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP15 as baseline. Thus, the files that remained unchanged from the AWP15 complete crawl were not archived (duplicated) on the AWP16 incremental crawl.
Topics: Incremental crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
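
The incremental-crawl idea described above (skip files that are unchanged relative to a baseline crawl) can be illustrated with a small conceptual sketch. This is not DeDuplicator's actual code: it assumes unchanged content is detected by comparing a per-URL content digest, and every function name, URL, and payload below is hypothetical.

```python
import hashlib
from typing import Dict


def digest(payload: bytes) -> str:
    """Return a SHA-1 hex digest of a fetched payload."""
    return hashlib.sha1(payload).hexdigest()


def build_baseline_index(baseline: Dict[str, bytes]) -> Dict[str, str]:
    """Map each URL of the baseline crawl to the digest of its content."""
    return {url: digest(body) for url, body in baseline.items()}


def incremental_crawl(fetched: Dict[str, bytes], index: Dict[str, str]) -> Dict[str, bytes]:
    """Keep only payloads that are new or changed relative to the baseline."""
    archived = {}
    for url, body in fetched.items():
        if index.get(url) == digest(body):
            continue  # unchanged since the baseline: do not store it again
        archived[url] = body
    return archived


# Hypothetical data standing in for a baseline crawl and a later crawl.
baseline = {
    "http://example.pt/": b"home v1",
    "http://example.pt/about": b"about",
}
fetched = {
    "http://example.pt/": b"home v2",     # changed since the baseline
    "http://example.pt/about": b"about",  # unchanged, so it is skipped
    "http://example.pt/news": b"news",    # new URL
}

stored = incremental_crawl(fetched, build_baseline_index(baseline))
print(sorted(stored))
```

In this sketch only the changed and the new URL are stored, which is why an incremental crawl has to be read together with its baseline to reconstruct the full snapshot.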
Incremental crawl of the Portuguese web performed between 30 May 2016 and 3 August 2016 mainly from .PT domain. The AWP21 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP20 as baseline. Thus, the files that remained unchanged from the AWP20 complete crawl were not archived (duplicated) on the AWP21 incremental crawl.
Topics: Incremental crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Incremental crawl of the Portuguese web performed between 12 November 2015 and 5 January 2016 mainly from .PT domain. The AWP19 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP18 as baseline. Thus, the files that remained unchanged from the AWP18 complete crawl were not archived (duplicated) on the AWP19 incremental crawl.
Topics: Incremental crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Incremental crawl of the Portuguese web performed between 31 October 2016 and 4 January 2017 mainly from .PT domain. The AWP22 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP21 as baseline. Thus, the files that remained unchanged from the AWP21 complete crawl were not archived (duplicated) on the AWP22 incremental crawl.
Topics: Incremental crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Complete crawl of the Portuguese web performed between 1 January 2017 and 7 May 2017 mainly from .PT domain. The AWP23 crawl did NOT use DeDuplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Complete crawl of the Portuguese web performed in October 2009 mainly from .PT domain.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Complete crawl of the Portuguese web performed in December 2009 mainly from .PT domain.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Complete crawl of the Portuguese web performed in May 2010 mainly from .PT domain.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Incremental crawl of the Portuguese web performed in August 2010 mainly from .PT domain. The AWP8 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP7 as baseline. Thus, the files that remained unchanged from the AWP7 complete crawl were not archived (duplicated) on the AWP8 incremental crawl.
Topics: Incremental crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Complete crawl of the Portuguese web performed in February and March 2008 mainly from .PT domain.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Complete crawl of the Portuguese web performed between 5 November 2013 and 13 January 2014 mainly from .PT domain. The AWP15 crawl did NOT use DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/).
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Complete crawl of the Portuguese web performed between 10 April 2015 and 9 June 2015 mainly from .PT domain. The AWP17 crawl did NOT use DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/).
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...